Parallel parsing made practical

Authors

  • Alessandro Barenghi
  • Stefano Crespi-Reghizzi
  • Dino Mandrioli
  • Federica Panella
  • Matteo Pradella
Abstract

The property of local parsability makes it possible to parse an input by inspecting only a bounded-length string around the current token. This in turn enables the construction of a scalable, data-parallel parsing algorithm, which is presented in this work. Such an algorithm lends itself to automatic generation by a parser generator tool, which we have implemented and also present here. Furthermore, to complete the framework for parallel input analysis, a parallel scanner can also be combined with the parser. To demonstrate the practicality of parallel lexing and parsing, we report the results of adapting JSON and Lua to a form fit for parallel parsing (i.e., an operator-precedence grammar) through simple grammar changes and scanning transformations. The approach is validated with performance figures from both high-performance and embedded multicore platforms, obtained by analyzing real-world inputs as a test bench. The results show that our approach matches or exceeds the performance of production-grade LR parsers in sequential execution, and achieves significant speedups and good scaling on multicore machines. The work concludes with a broad and critical survey of past work on parallel parsing and of future directions for integration with semantic analysis and incremental parsing.

Similar resources

Massively Parallel Memory-Based Parsing

This paper discusses a radically new scheme of natural language processing called massively parallel memory-based parsing. Most parsing schemes are rule-based or principle-based, which involves extensive serial rule application. Thus, parsing is a time-consuming task that requires a few seconds or even a few minutes to complete for one sentence. Also, the degree of parallelism attained b...

PAPAGENO: A Parallel Parser Generator for Operator Precedence Grammars

In almost all language processing applications, languages are parsed employing classical algorithms (such as the LR(1) parsers generated by Bison), which are sequential due to their left-to-right state-dependent nature. Although early theoretical studies on parallel parsing algorithms delineated potential speedups on abstract parallel machines using a data-parallel approach, practical developme...

HPar: A Practical Parallel Parser for HTML – Taming HTML Complexities for Parallel Parsing

Parallelizing HTML parsing is challenging due to the complexities of HTML documents and the inherent dependences in its parsing algorithm. As a result, despite numerous studies in parallel parsing, HTML parsing remains sequential today. It forms one of the final barriers for fully parallelizing browser operations to minimize the browser’s response time—an important variable for user experiences...

A Parallel Extension of Earley’s Parsing Algorithm

Parsing is the process of deriving structure from a string, and can be used to describe the meaning of the string, and the relationships between its elements. This paper describes two popular parsing algorithms, CKY and Earley. This paper also discusses attempts others have made to distribute the processing workload of the CKY algorithm in a parallel environment. The paper then describes how I ...

A Parallel Augmented Context-Free Parsing System For Natural Language Analysis

Parsing efficiency is one of the important issues in building practical natural language processing systems. This paper proposes a design and an implementation of a parallel augmented context-free parsing system for natural language analysis. Natural language grammars are more than context-free, so that unification formalisms are adopted to enforce the linguistic constraints and to transfer the...

Journal title:
  • Sci. Comput. Program.

Volume 112

Publication date: 2015